Create a big.matrix by reading from a suitably-formatted ASCII file, or write the contents of a big.matrix to a file.

write.big.matrix(x, filename, row.names = FALSE,
                 col.names = FALSE, sep = ',')

read.big.matrix(filename, sep = ',', header = FALSE,
                col.names = NULL, row.names = NULL,
                has.row.names = FALSE, ignore.row.names = FALSE,
                type = NA, skip = 0, separated = FALSE,
                backingfile = NULL, backingpath = NULL,
                descriptorfile = NULL, binarydescriptor = FALSE,
                extraCols = NULL, shared = TRUE)
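
x: a big.matrix.
header: if TRUE, the first line (after a possible skip) should contain column names.
has.row.names: if TRUE, then the first column contains row names.
ignore.row.names: if TRUE when has.row.names==TRUE, the row names will be ignored.
type: preferably specified, "integer" for example.
backingfile: the root name for the file(s) for the cache of x.
binarydescriptor: whether the binary RDS format should be used for the backing file description, for subsequent use with attach.big.matrix; if NULL or FALSE, the dput() file format is used.
shared: if TRUE, the resulting big.matrix can be shared across processes.

A big.matrix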
object is returned by read.big.matrix, while write.big.matrix creates an output file (a path could be part of filename).

Files must contain only one atomic type (all integer, for example). You, the user, should know whether your file has row and/or column names, and various combinations of options should be helpful in obtaining the desired behavior.

When reading from a file, if type is not specified we try to make a reasonable guess for you, without making any guarantees at this point.
Unless you have really large integer values, we recommend you consider "short". If you have something that is essentially categorical, you might even be able to use "char", with huge memory savings for large data sets. Any non-numeric entry will be ignored and replaced with NA, so reading something that traditionally would be a data.frame won't cause an error. A warning is issued.
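
The advice on choosing a type can be sketched in base R. The helper below, suggest_type, is hypothetical and not part of bigmemory. It assumes that "char", "short" and "integer" are signed 1-, 2- and 4-byte integers with the most negative value reserved for NA, so treat the cut-offs as an illustration rather than a specification.

```r
# Hypothetical helper (not part of bigmemory): suggest the smallest storage
# type that can hold every non-missing value of a numeric vector.
# Assumed ranges: char +/-127, short +/-32767, integer +/-.Machine$integer.max.
suggest_type <- function(v) {
  v <- v[!is.na(v)]                      # NA and NaN do not influence the choice
  if (length(v) == 0L) return("char")    # nothing but missing values
  # Infinite or fractional values need floating point storage.
  if (any(!is.finite(v)) || any(v != round(v))) return("double")
  m <- max(abs(v))
  if (m <= 127) {
    "char"
  } else if (m <= 32767) {
    "short"
  } else if (m <= .Machine$integer.max) {
    "integer"
  } else {
    "double"
  }
}

suggest_type(c(0, 1, 2, NA))   # "char": small categorical codes
suggest_type(c(-500, 30000))   # "short"
suggest_type(c(1, 70000))      # "integer"
suggest_type(c(0.5, 2))        # "double"
```

One way to use such a helper is to apply it to each column of a sample of rows (for example from read.csv with nrows set) and pass the widest result as type. A sample cannot rule out larger values further down the file, so this only narrows the guess.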
Wishlist: we'd like to provide an option to ignore specified columns while doing reads. Or perhaps to specify columns targeted for factor or character conversion to numeric values. Would you use such features? Email us and let us know!
See also: big.matrix
library(bigmemory)

# Without specifying the type, this big.matrix x will hold integers.
x <- as.big.matrix(matrix(1:10, 5, 2))
x[2,2] <- NA
x[,]
write.big.matrix(x, "foo.txt")
# Just for fun, I'll read it back in as character (1-byte integers):
y <- read.big.matrix("foo.txt", type="char")
y[,]
# Other examples:
w <- as.big.matrix(matrix(1:10, 5, 2), type='double')
w[1,2] <- NA
w[2,2] <- -Inf
w[3,2] <- Inf
w[4,2] <- NaN
w[,]
write.big.matrix(w, "bar.txt")
w <- read.big.matrix("bar.txt", type="double")
w[,]
w <- read.big.matrix("bar.txt", type="short")
w[,]
# Another example using row names (which we don't like).
x <- as.big.matrix(as.matrix(iris), type='double')
rownames(x) <- as.character(1:nrow(x))
head(x)
write.big.matrix(x, 'IrisData.txt', col.names=TRUE, row.names=TRUE)
y <- read.big.matrix("IrisData.txt", header=TRUE, has.row.names=TRUE)
head(y)
# The following would fail with a dimension mismatch:
if (FALSE) y <- read.big.matrix("IrisData.txt", header=TRUE)
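
The examples above do not exercise the backingfile, backingpath and descriptorfile arguments shown in the usage. The following is a minimal sketch of a file-backed read, assuming bigmemory is installed. The file names (small.csv, small.bin, small.desc) and the temporary directory are illustrative choices, and re-attaching by passing the descriptor's full path to attach.big.matrix is assumed to behave as in current releases.

```r
# Sketch only: file names and the temporary directory are illustrative.
if (requireNamespace("bigmemory", quietly = TRUE)) {
  library(bigmemory)

  # Use a fresh directory so no backing file from an earlier run is in the way.
  dir <- tempfile("bm")
  dir.create(dir)

  # A small comma-separated file without row or column names.
  csv <- file.path(dir, "small.csv")
  write.table(matrix(1:10, 5, 2), csv, sep = ",",
              row.names = FALSE, col.names = FALSE)

  # Read into a file-backed big.matrix: the data are stored in small.bin and
  # the information needed to re-attach is written to small.desc.
  x <- read.big.matrix(csv, sep = ",", type = "integer",
                       backingfile = "small.bin", backingpath = dir,
                       descriptorfile = "small.desc")

  # Attach the same data again from the descriptor file, as another
  # R session could do.
  y <- attach.big.matrix(file.path(dir, "small.desc"))
  y[, ]
}
```

Because x and y refer to the same backing file, an assignment through one should be visible through the other.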